Morpho Challenge - Evaluation of algorithms for unsupervised learning of morphology in various tasks and languages
Abstract
After the release of the open source software implementation of Morfessor algorithm, a series of several open evaluations has been organized for unsupervised morpheme analysis and morpheme-based speech recognition and information retrieval. The unsupervised morpheme analysis is a particularly attractive approach for speech and language technology for the morphologically complex languages. When the amount of distinct word forms becomes prohibitive for the construction of a sufficient lexicon, it is important that the words can be segmented into smaller meaningful language modeling units. In this presentation we will demonstrate the results of the evaluations, the baseline systems built using the open source tools, and invite research groups to participate in the next evaluation where the task is to enhance statistical machine translation by morpheme analysis.
Similar resources
Proceedings of the Morpho Challenge 2010 Workshop
In natural language processing many practical tasks, such as speech recognition, information retrieval and machine translation depend on a large vocabulary and statistical language models. For morphologically rich languages, such as Finnish and Turkish, the construction of a vocabulary and language models that have a sufficient coverage is particularly difficult, because of the huge amount of d...
Evaluating an Agglutinative Segmentation Model for ParaMor
This paper describes and evaluates a modification to the segmentation model used in the unsupervised morphology induction system, ParaMor. Our improved segmentation model permits multiple morpheme boundaries in a single word. To prepare ParaMor to effectively apply the new agglutinative segmentation model, two heuristics improve ParaMor’s precision. These precision-enhancing heuristics are adap...
Retrieval Experiments at Morpho Challenge 2008
Morpho Challenge 2008 hosted an extrinsic evaluation of morphological analysis that explored whether unsupervised morphology induction could benefit information retrieval. This paper presents results in alternative methods for word normalization using test sets from the Cross-Language Evaluation Forum (CLEF) ad-hoc collections. Preliminary results for the Morpho Challenge 2008 evaluation are co...
Overview and Results of Morpho Challenge 2009
The goal of Morpho Challenge 2009 was to evaluate unsupervised algorithms that provide morpheme analyses for words in different languages and in various practical applications. Morpheme analysis is particularly useful in speech recognition, information retrieval and machine translation for morphologically rich languages where the amount of different word forms is very large. The evaluations con...
Morpho Challenge competition 2005-2010: Evaluations and results
Morpho Challenge is an annual evaluation campaign for unsupervised morpheme analysis. In morpheme analysis, words are segmented into smaller meaningful units. This is an essential part in processing complex word forms in many large-scale natural language processing applications, such as speech recognition, information retrieval, and machine translation. The discovery of morphemes is particularl...
Unsupervised Morpheme Analysis Evaluation by a Comparison to a Linguistic Gold Standard - Morpho Challenge 2008
The goal of Morpho Challenge 2008 was to find and evaluate unsupervised algorithms that provide morpheme analyses for words in different languages. Especially in morphologically complex languages, such as Finnish, Turkish and Arabic, morpheme analysis is important for lexical modeling of words in speech recognition, information retrieval and machine translation. The evaluation in Morpho Challen...